Extended Anova and rank transform procedures
https://nova.newcastle.edu.au/vital/access/manager/Repository/uon:14147
Wed 11 Apr 2018 16:16:23 AEST

Extended analysis of at least partially ordered multi-factor ANOVA
https://nova.newcastle.edu.au/vital/access/manager/Repository/uon:27271
Wed 04 Sep 2019 10:39:29 AEST

Picking items for experimental sets: measures of similarity and methods for optimisation
https://nova.newcastle.edu.au/vital/access/manager/Repository/uon:28931
[...] $f_{wl}$ denote the number of letters, $l$, in word $w$. One approach to measuring the similarity of the two sets is to compare the average value of the attribute across the sets, i.e. to measure based on the difference $\left| \frac{1}{|B_1|} \sum_{w \in B_1} f_{wl} - \frac{1}{|B_2|} \sum_{w \in B_2} f_{wl} \right|$. However, it is well known that very different distributions can have the same average value. For example, define the attribute value count vectors $\eta_{B_i c} = |\{ w \in B_i : f_{wl} = c \}|$ for each $i = 1, 2$ and each positive integer value $c$ that could be the length of a word, say from 1 to 5 letters. Comparing averages would consider two sets with $\eta_{B_1} = (3, 3, 3, 3, 3)$ and $\eta_{B_2} = (0, 0, 15, 0, 0)$ to be very similar, since both have an average word length of 3, whereas clearly the experience of a human subject exposed to these two sets might be very different: the former has an even spread of word lengths whereas the latter has all words of identical length. The existing metaheuristics address this issue by using group characteristics, such as average or standard deviation, which take into account the relative values of the heuristics. However, as we have shown, these group characteristics do not adequately measure the similarity of the sets. Recent MIP approaches measure similarity between sets using the entire histogram, i.e. they measure based on the difference $|\eta_{B_1 c} - \eta_{B_2 c}|$ for each $c$.
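The contrast between the two measures can be checked by hand on the count vectors from the abstract. The following is a minimal sketch, not the authors' implementation: the function names `mean_length` and `histogram_difference` are hypothetical, and summing the per-value differences into a single number is an assumption made here for illustration (the abstract only states that the difference is taken for each c).

```python
from typing import Sequence


def mean_length(counts: Sequence[int]) -> float:
    """Average word length, where counts[c - 1] is the number of words of length c."""
    total_words = sum(counts)
    return sum(c * n for c, n in enumerate(counts, start=1)) / total_words


def histogram_difference(a: Sequence[int], b: Sequence[int]) -> int:
    """Sum over c of |a_c - b_c| (the summation is an illustrative assumption)."""
    return sum(abs(x - y) for x, y in zip(a, b))


# Count vectors taken from the abstract's example (word lengths 1 to 5).
eta_b1 = (3, 3, 3, 3, 3)
eta_b2 = (0, 0, 15, 0, 0)

mean_gap = abs(mean_length(eta_b1) - mean_length(eta_b2))
hist_gap = histogram_difference(eta_b1, eta_b2)

print(mean_gap)  # 0.0: the averages cannot tell the two sets apart
print(hist_gap)  # 24: the histograms differ substantially
```

Under these assumptions the average-based measure reports no difference at all, while the histogram-based measure separates the two sets.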
Whilst the histogram comparison provides a richer measure of similarity than simple averages, it does not take into account the relationships between attribute values. To return to the word length illustration, the length count vectors (0, 3, 3, 3, 6) and (3, 4, 5, 0, 3) are “equally” different from (3, 3, 3, 3, 3) component-wise. But it is common sense that words of length 2 or 3 are more similar to words of length 4 than words of length 1 are to words of length 5, so the vector (0, 3, 3, 3, 6), which “replaces” three words of length 1 with three of length 5, is less similar to (3, 3, 3, 3, 3) than is (3, 4, 5, 0, 3), which “replaces” three words of length 4 with two of length 3 and one of length 2. The component-wise histogram measure cannot distinguish these two cases. This paper briefly reviews the existing approaches to automating the picking of items for experimental sets, and then discusses new MIP approaches that address the entire distribution of attribute values across sets while also taking into account the relationships between attribute values. Numerical results on psycholinguistic data sets are analysed, and the alternative approaches are compared.
Sat 24 Mar 2018 07:31:28 AEDT
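The word length illustration can be reproduced numerically. The sketch below compares the component-wise histogram difference with one possible measure that respects the ordering of attribute values. The ordering-aware measure used here, a one-dimensional earth mover's distance computed from cumulative counts, is an assumption chosen for illustration; the abstract does not say which measure the paper's MIP formulations use, and the names `componentwise_gap` and `ordered_gap` are hypothetical.

```python
from itertools import accumulate
from typing import Sequence


def componentwise_gap(a: Sequence[int], b: Sequence[int]) -> int:
    """Sum over c of |a_c - b_c|; ignores how far apart the values c are."""
    return sum(abs(x - y) for x, y in zip(a, b))


def ordered_gap(a: Sequence[int], b: Sequence[int]) -> int:
    """1-D earth mover's distance for count vectors with equal totals.

    Equals the sum of absolute differences of the cumulative counts, i.e. the
    total number of length steps words must be moved to turn a into b.
    """
    return sum(abs(x - y) for x, y in zip(accumulate(a), accumulate(b)))


# Count vectors taken from the abstract (word lengths 1 to 5).
reference = (3, 3, 3, 3, 3)
spread = (0, 3, 3, 3, 6)    # three words of length 1 replaced by length 5
shifted = (3, 4, 5, 0, 3)   # three words of length 4 replaced by lengths 3, 3, 2

print(componentwise_gap(reference, spread), componentwise_gap(reference, shifted))  # 6 6
print(ordered_gap(reference, spread), ordered_gap(reference, shifted))              # 12 4
```

With this stand-in measure, (0, 3, 3, 3, 6) is three times further from the reference than (3, 4, 5, 0, 3), which matches the common-sense ranking described in the abstract, whereas the component-wise measure scores both at 6.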